    Loss Severity Distribution Estimation Of Operational Risk Using Gaussian Mixture Model For Loss Distribution Approach

    Banks must be able to manage all banking risks, one of which is operational risk. Banks manage operational risk by estimating the capital needed to cover it, known as economic capital (EC). The Loss Distribution Approach (LDA) is a popular method for estimating EC. This paper proposes the Gaussian Mixture Model (GMM) for estimating the loss severity distribution within the LDA. The result of this research is that the EC produced by the LDA with a GMM severity model is 2%-2.8% smaller than the EC produced by the LDA with existing distribution models. Keywords: Loss Distribution Approach, Gaussian Mixture Model, Bayesian Information Criterion, Operational Risk
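
    The following is a minimal sketch of this approach, not the paper's implementation: severities are fitted with scikit-learn's GaussianMixture (in log space, an assumption), the number of components is chosen by BIC per the keywords above, and EC is read off as a high quantile of a Monte Carlo aggregate loss distribution. The Poisson frequency model and all parameter values are illustrative.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(0)
        losses = rng.lognormal(mean=10.0, sigma=1.5, size=5000)   # stand-in operational loss data
        log_losses = np.log(losses).reshape(-1, 1)

        # Choose the number of mixture components by BIC.
        candidates = [GaussianMixture(n_components=k, random_state=0).fit(log_losses)
                      for k in range(1, 6)]
        best = min(candidates, key=lambda g: g.bic(log_losses))

        # Monte Carlo LDA: annual loss = sum of N severities, N ~ Poisson (assumed frequency model).
        n_years = 20_000
        freq = rng.poisson(lam=50, size=n_years)
        sev = np.exp(best.sample(int(freq.sum()))[0].ravel())     # one severity draw for all years
        agg = np.array([part.sum() for part in np.split(sev, np.cumsum(freq)[:-1])])

        # Economic capital as the 99.9% quantile (VaR) of the aggregate loss distribution.
        print(f"EC (99.9% VaR): {np.quantile(agg, 0.999):,.0f}")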

    Implementation and Analysis of Combined Machine Learning Method for Intrusion Detection System

    As one of the security components in a Network Security Monitoring System, an Intrusion Detection System (IDS) is implemented by many organizations in their networks to detect and address the impact of network attacks. Many machine learning methods have been widely developed and applied in IDSs, and selecting an appropriate method is necessary to improve detection accuracy. In this research we propose an IDS developed with a machine learning approach. We use a 28-feature subset of the Knowledge Discovery in Databases (KDD) dataset, excluding its content features, to build the machine learning model. From our analysis and experiments, this 28-feature subset is the one most likely to be applicable to an IDS in a real network. The machine learning model based on this subset obtained 99.9% accuracy for both two-class and multiclass classification, and our experiments show that the IDS we developed performs well in detecting attacks on real networks.
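
    A hedged sketch of such a pipeline on the classic KDD Cup '99 data: the abstract does not list the exact 28 features, so here the 13 content features (columns 10-22 in the standard KDD ordering) are simply dropped, and a random forest stands in for whichever combined classifier the authors used.

        import pandas as pd
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import train_test_split

        # Assumed local copy of the KDD Cup '99 10% file: 41 features + label, no header.
        df = pd.read_csv("kddcup.data_10_percent", header=None)
        X, y = df.iloc[:, :-1], df.iloc[:, -1]

        # Drop the 13 content features (0-indexed columns 9-21), leaving 28 basic/traffic features.
        X = X.drop(columns=range(9, 22))
        # One-hot encode the symbolic features: protocol_type, service, flag.
        X = pd.get_dummies(X, columns=[1, 2, 3])

        # For two-class detection, collapse the labels to normal vs. attack instead:
        # y = (y != "normal.").astype(int)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
        print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))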

    Sensing Trending Topics in Twitter for Greater Jakarta Area

    Information and communication technology is growing rapidly nowadays, especially in relation to the internet. Twitter is one internet application that produces a large amount of textual data in the form of tweets. The tweets may reflect real-world situations discussed in a community, so Twitter can be an important medium for urban monitoring: the ability to monitor these situations may guide local governments to respond quickly or to shape public policy. Topic detection is an important automatic tool for understanding the tweets, for example using non-negative matrix factorization. In this paper, we conduct a study that uses Twitter as a medium for urban monitoring in Jakarta and its surrounding areas, known as Greater Jakarta. First, we analyze the accuracy of the detected topics in terms of their interpretability. Next, we visualize the trend of the topics so that popular topics can be identified easily. Our simulations show that the topic detection methods can extract topics at a reasonable level of accuracy and draw their trends such that topic monitoring can be conducted easily.
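
    A minimal sketch of NMF-based topic detection of the kind described, assuming the tweets have already been collected into a text file (the file name, vocabulary size, and number of topics are illustrative, not the paper's).

        from sklearn.decomposition import NMF
        from sklearn.feature_extraction.text import TfidfVectorizer

        # Assumed input: one collected tweet per line (hypothetical file name).
        tweets = open("tweets_jakarta.txt", encoding="utf-8").read().splitlines()

        # Factorize the TF-IDF matrix: X ~ W (document-topic) x H (topic-term).
        vec = TfidfVectorizer(max_features=5000)
        X = vec.fit_transform(tweets)
        nmf = NMF(n_components=10, random_state=0)
        W = nmf.fit_transform(X)
        H = nmf.components_

        # Inspect the top terms per topic to judge interpretability; summing the
        # rows of W per time bin would give the topic trends to visualize.
        terms = vec.get_feature_names_out()
        for k, row in enumerate(H):
            top = [terms[i] for i in row.argsort()[::-1][:8]]
            print(f"topic {k}: {' '.join(top)}")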

    A Comparative Study of Feature Selection for Support Vector Machines in Credit Risk Scoring Classification

    Credit scoring is a system or method used by banks and other financial institutions to determine whether a debtor is eligible for a loan. One credit scoring method used to classify debtor characteristics is the Support Vector Machine (SVM). The SVM has excellent generalization ability for classification problems on large amounts of data and can generate an optimal separating function between two classes. The success of the SVM method depends in part on the feature selection process, which affects classification accuracy; various feature selection methods have been developed because not all features contribute to the best classification results. The feature selection methods used in this study are Variance Threshold, Univariate Chi-Square, Recursive Feature Elimination (RFE), and Extra Trees Classifier (ETC). The study uses secondary data from the UCI Machine Learning Repository. Based on simulations comparing the accuracy of these feature selection methods with SVM for credit risk scoring classification, we find that Variance Threshold and Univariate Chi-Square can decrease accuracy, while RFE and ETC can increase it; RFE gives the best accuracy. Keywords: Credit scoring, Credit risk, Feature selection, Support vector machine
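
    A sketch of this comparison on the UCI German credit data, under stated assumptions: the numeric version of the dataset (german.data-numeric), 5-fold cross-validated accuracy, and illustrative selector settings (k, threshold, tree count).

        import numpy as np
        from sklearn.ensemble import ExtraTreesClassifier
        from sklearn.feature_selection import (RFE, SelectFromModel, SelectKBest,
                                               VarianceThreshold, chi2)
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import MinMaxScaler, StandardScaler
        from sklearn.svm import SVC

        data = np.loadtxt("german.data-numeric")      # assumed local copy of the UCI file
        X, y = data[:, :-1], data[:, -1]

        selectors = {
            "Variance Threshold": VarianceThreshold(threshold=0.1),
            # chi2 needs non-negative features, hence the MinMaxScaler in front.
            "Univariate Chi-Square": make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=10)),
            "RFE": RFE(SVC(kernel="linear"), n_features_to_select=10),
            "Extra Trees": SelectFromModel(ExtraTreesClassifier(n_estimators=100, random_state=0)),
        }
        for name, sel in selectors.items():
            pipe = make_pipeline(sel, StandardScaler(), SVC())
            print(f"{name}: {cross_val_score(pipe, X, y, cv=5).mean():.3f}")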

    Machine Learning for Text Indexing: Concept Extraction, Keyword Extraction and Tag Recommendation

    Due to some drawbacks, mainly semantic issues such as synonymy and polysemy, several approaches have been considered to improve the performance of full-text indexing. The alternative approaches include latent semantic indexing, keyword indexing, social indexing (Web 2.0) and linked data-based indexing (Semantic Web). The aim of this dissertation is to investigate applications of machine learning methods for these alternative approaches. The application areas are concept extraction, keyword extraction and tag recommendation. Firstly, we propose a new learning method called two-level learning hierarchy (TLLH) to extract concepts from tagged textual contents. This learning method processes the two existing textual sources, i.e. the user-created tags and the textual contents, separately. At the lower level, concepts and concept-document relationships are discovered by a non-negative matrix factorization (NMF) algorithm based on the user-created tags. Given these relationships, the concepts are populated with terms from the textual contents at the higher level. We expect this method to be successful because the hidden document structures are discovered from tags collectively created by users who understand the semantic content of the documents. Another advantage is that the NMF algorithm produces more compact and cleaner document representations. On the other hand, concept extraction from the textual contents is handled by a non-negative least squares (NNLS) algorithm, which is much more efficient than the NMF algorithm. Moreover, the TLLH approach may have richer vocabularies because it can combine vocabularies from the user-created tags and the textual contents. Therefore, this approach is not only more reliable but also more efficient than the standard one-level learning hierarchy (OLLH), which extracts concepts only from the textual contents. Next, we apply the extracted concepts in a keyword extraction method. In other words, we propose a new keyword extraction method called concept-based keyword extraction (CBKE). Its basic idea is that a term of a document is important if the term is associated with important concepts of the document and is itself important in the document. Flexibility regarding the characteristics of the learning data is one of the advantages of this method: it can operate on learning data either with or without manually assigned keywords. Finally, we apply the proposed CBKE method to content-based tag recommendation in a folksonomy. The results show that the tag recommendations achieved competitive performance in the ECML PKDD Discovery Challenge 2009.
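
    A minimal sketch of the two-level idea under stated assumptions: random matrices stand in for the tag-document and term-document data, NMF recovers the concept-document relationships from the tags at the lower level, and NNLS then populates the concepts with content terms at the higher level.

        import numpy as np
        from scipy.optimize import nnls
        from sklearn.decomposition import NMF

        rng = np.random.default_rng(0)
        k = 3                              # number of latent concepts (illustrative)
        T = rng.random((50, 200))          # tag-document matrix, stand-in for user-created tags
        D = rng.random((400, 200))         # term-document matrix, stand-in for textual contents

        # Lower level: T ~ W_tag x H, so H holds the concept-document relationships
        # discovered from the user-created tags.
        nmf = NMF(n_components=k, random_state=0)
        W_tag = nmf.fit_transform(T)
        H = nmf.components_                # shape (k, n_documents)

        # Higher level: with H fixed, solve one non-negative least-squares problem
        # per content term, D ~ W_term x H, which is cheaper than a second NMF.
        W_term = np.vstack([nnls(H.T, row)[0] for row in D])
        print(W_term.shape)                # (400, 3): term-concept weights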

    STUDY ON THE GENERALIZATION CAPABILITY OF SUPPORT VECTOR MACHINES IN SPLICE SITE TYPE RECOGNITION OF DNA SEQUENCES

    Recently, the support vector machine (SVM) has become a popular machine learning model. A particular advantage of the SVM over other machine learning methods is that it can be analyzed theoretically and at the same time achieve good performance when applied to real problems. This paper describes analytically the use of the SVM to solve pattern recognition problems, with a preliminary case study on determining the type of splice site in a DNA sequence, focusing in particular on generalization capability. The results obtained show that the SVM has a good generalization capability of around 95.4%. Keywords: Support vector machine, generalization test, pattern recognition, splice sites, DNA
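
    A hedged sketch of such a case study, assuming a local copy of the UCI splice-junction dataset (lines of the form "class, name, 60-base sequence"); the one-hot encoding, RBF kernel, and cross-validated estimate of generalization are illustrative choices, not necessarily the paper's.

        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        # Assumed input file: UCI splice data, "EI/IE/N, name, sequence" per line.
        labels, seqs = [], []
        for line in open("splice.data"):
            cls, _, seq = [field.strip() for field in line.split(",")]
            labels.append(cls)
            seqs.append(seq)

        # One-hot encode each 60-base window; ambiguous symbols map to all-zero slots.
        base = {"A": 0, "C": 1, "G": 2, "T": 3}
        X = np.zeros((len(seqs), 60 * 4))
        for i, s in enumerate(seqs):
            for j, ch in enumerate(s):
                if ch in base:
                    X[i, j * 4 + base[ch]] = 1.0

        # Cross-validated accuracy as a proxy for generalization capability.
        clf = SVC(kernel="rbf", C=1.0, gamma="scale")
        print("CV accuracy:", cross_val_score(clf, X, np.array(labels), cv=5).mean())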

    A Comparative Study of the Generalization Capability of Artificial Neural Networks Based on Incremental Projection Learning

    One of the essential properties of supervised learning in neural networks is generalization capability: the ability to give accurate results for data not seen during the learning process. One supervised learning method that theoretically guarantees optimal generalization capability is incremental projection learning. This paper describes an experimental evaluation of the generalization capability of incremental projection learning in neural networks, called projection generalizing neural networks, for solving function approximation problems, and compares them with other commonly used neural networks, i.e. back propagation networks and radial basis function networks. Based on our experiments, projection generalizing neural networks do not always give better generalization capability than the other two networks. They give better generalization capability when the number of learning data is small enough or the noise variance of the learning data is large enough; outside these two conditions, they do not always generalize better. Indeed, when the number of learning data is large enough and the noise variance is small enough, projection generalizing neural networks give worse generalization capability than back propagation networks. Keywords: supervised learning, incremental projection learning, generalization capability, artificial neural networks, function approximation problem
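
    Incremental projection learning itself is not reproduced here, but the experimental protocol can be sketched: approximate a known function from noisy samples while varying the sample size and noise variance, and compare test error across models. A back propagation network is represented by scikit-learn's MLPRegressor and a radial basis function network by an RBF-kernel ridge regressor; the target function and all settings are illustrative.

        import numpy as np
        from sklearn.kernel_ridge import KernelRidge
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(0)
        f = lambda x: np.sinc(x)                      # illustrative target function

        def test_error(model, n, noise_std):
            x = rng.uniform(-3, 3, size=(n, 1))
            y = f(x).ravel() + rng.normal(0, noise_std, size=n)   # noisy learning data
            xt = np.linspace(-3, 3, 500).reshape(-1, 1)           # clean test grid
            model.fit(x, y)
            return np.mean((model.predict(xt) - f(xt).ravel()) ** 2)

        # Vary the number of learning data and the noise level, as in the study.
        for n in (20, 200):
            for s in (0.05, 0.5):
                bp = MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000, random_state=0)
                rbf = KernelRidge(kernel="rbf", alpha=1e-2, gamma=2.0)
                print(f"n={n:4d} noise={s}: BP={test_error(bp, n, s):.4f}  RBF={test_error(rbf, n, s):.4f}")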